Iterative-improvement-based declustering heuristics for multi-disk databases

Authors

  • Mehmet Koyutürk
  • Cevdet Aykanat
Abstract

Data declustering is an important issue for reducing query response times in multi-disk database systems. In this paper, we propose a declustering method that utilizes the available information on query distribution, data distribution, data-item sizes, and disk capacity constraints. The proposed method exploits the natural correspondence between a data set with a given query distribution and a hypergraph. We define an objective function that exactly represents the aggregate parallel query-response time for the declustering problem and adapt the iterative-improvement-based heuristics successfully used in hypergraph partitioning to this objective function. We propose a two-phase algorithm that first obtains an initial K-way declustering by recursively bipartitioning the data set, then applies multi-way refinement on this declustering. We provide effective gain models and efficient implementation schemes for both phases. The experimental results on a wide range of realistic data sets show that the proposed method provides a significant performance improvement compared with the state-of-the-art declustering strategy based on similarity-graph partitioning. © 2003 Elsevier Ltd. All rights reserved.
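
The abstract does not spell out the objective function or the gain models, so the following is only an illustrative sketch under stated assumptions, not the authors' algorithm. It assumes that the parallel response time of a query is the total size of the requested items on its busiest disk, and that the aggregate cost is the frequency-weighted sum of these times. The function names (query_cost, aggregate_cost, refine) are hypothetical, and the refinement shown is a naive greedy single-item move loop that ignores disk capacity constraints and the recursive-bipartitioning phase.

```python
from collections import defaultdict

def query_cost(query, assignment, sizes=None):
    """Assumed response time of one query: the load on its busiest disk."""
    load = defaultdict(int)
    for item in query:
        # Unit item size unless a size table is supplied.
        load[assignment[item]] += 1 if sizes is None else sizes[item]
    return max(load.values(), default=0)

def aggregate_cost(queries, assignment, sizes=None):
    """Frequency-weighted sum of per-query response times."""
    return sum(freq * query_cost(items, assignment, sizes)
               for items, freq in queries)

def refine(queries, assignment, num_disks, sizes=None):
    """Naive iterative improvement: keep moving single items to other
    disks while some move strictly lowers the aggregate cost."""
    assignment = dict(assignment)
    best = aggregate_cost(queries, assignment, sizes)
    improved = True
    while improved:
        improved = False
        for item in sorted(assignment):
            home = assignment[item]
            for disk in range(num_disks):
                if disk == home:
                    continue
                assignment[item] = disk
                cost = aggregate_cost(queries, assignment, sizes)
                if cost < best:
                    best, home, improved = cost, disk, True
            assignment[item] = home  # settle on the best disk found
    return assignment, best

# Hypothetical workload: (set of requested items, query frequency).
queries = [({"a", "b"}, 3), ({"b", "c"}, 1), ({"a", "c"}, 2)]
start = {"a": 0, "b": 0, "c": 0}  # everything on disk 0
final, cost = refine(queries, start, num_disks=2)
```

Each query in this toy workload corresponds to one net of a hypergraph whose vertices are the data items. The loop recomputes the full cost for every trial move, which is quadratic and only acceptable for small inputs; an efficient implementation would maintain per-move gains incrementally instead.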

Similar resources

Declustering Objects for Visualization

In this paper we propose a new declustering method which is particularly suitable for image and cartographic databases used for visualization. Our declustering method is based on algebraic techniques using vectors. The algorithm which computes the disk assignment requires O(Kj log K) time where K is the number of parallel disks in the system. The resulting disk assignment maximizes the area tha...

Scalability Analysis of Declustering Methods for Cartesian Product Files

Efficient storage and retrieval of multi-attribute datasets has become one of the essential requirements for many data-intensive applications. The Cartesian product file has been known as an effective multi-attribute file structure for partial-match and best-match queries. Several heuristic methods have been developed to decluster Cartesian product files over multiple disks to obtain high perfo...

Declustering Databases on Heterogeneous Disk Systems

Declustering is a well-known strategy to achieve maximum I/O parallelism in multi-disk systems. Many declustering methods have been proposed for symmetrical disk systems, i.e., multi-disk systems in which all disks have the same speed and capacity. This work deals with the problem of adapting such declustering methods to work in heterogeneous environments. In such environments there are many type...

Efficient Disk Allocation for Fast Similarity Searching

As databases increasingly integrate non-textual information it is becoming necessary to support efficient similarity searching in addition to range searching. Recently, declustering techniques have been proposed for improving the performance of similarity searches through parallel I/O. In this paper, we propose a new scheme which provides good declustering for similarity searching. In particular...

Concentric Hyperspaces and Disk Allocation for Fast Parallel Range Searching

Data partitioning and declustering have been extensively used in the past to parallelize I/O for range queries. Numerous declustering and disk allocation techniques have been proposed in the literature. However, most of these techniques were primarily designed for two-dimensional data and for balanced partitioning of the data space. As databases increasingly integrate multimedia information in ...

Journal title:
  • Inf. Syst.

Volume 30  Issue 

Pages  -

Publication date 2005